Vision Transformer
CLIP ViT-H/14
CLIP ViT-L/14
We release a new CLIP ViT-G/14 CLIP model with OpenCLIP which achieves 80.1% zero-shot accuracy on ImageNet and 74.9% zero-shot image retrieval (Recall@5) on MS COCO. As of January 2023, this is the best open source CLIP model.
https://t.co/TmVTUP3tBx
https://t.co/PMnpUUTNpc
LAION
https://gyazo.com/8156059952ad2cb62654ce8040937d8d
https://huggingface.co/laion/CLIP-ViT-bigG-14-laion2B-39B-b160k
LAION
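As a rough illustration of the two metrics quoted above (zero-shot accuracy and Recall@5), here is a minimal sketch on toy embeddings. It is not LAION's evaluation code. The function names and the toy vectors are hypothetical, and a real evaluation would use the CLIP image and text encoders over the full ImageNet and MS COCO test sets.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def zero_shot_predict(image_emb, class_text_embs):
    """Index of the class prompt whose text embedding is closest to the image.

    Zero-shot classification: no classifier is trained; each class is
    represented only by the embedding of a text prompt.
    """
    sims = [cosine(image_emb, t) for t in class_text_embs]
    return max(range(len(sims)), key=sims.__getitem__)


def recall_at_k(query_embs, gallery_embs, gt_indices, k=5):
    """Fraction of queries whose ground-truth gallery item is in the top k."""
    hits = 0
    for query, gt in zip(query_embs, gallery_embs and gt_indices):
        sims = [cosine(query, g) for g in gallery_embs]
        # Rank gallery indices by similarity, most similar first.
        ranked = sorted(range(len(sims)), key=sims.__getitem__, reverse=True)
        if gt in ranked[:k]:
            hits += 1
    return hits / len(query_embs)


# Hypothetical toy data: three class prompts and one image embedding.
class_text_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
image_emb = [0.9, 0.1, 0.2]
predicted_class = zero_shot_predict(image_emb, class_text_embs)

# Hypothetical toy retrieval set: three text queries against three images.
gallery = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[1.0, 0.1], [0.1, 1.0], [1.0, 0.9]]
ground_truth = [0, 1, 0]
r_at_1 = recall_at_k(queries, gallery, ground_truth, k=1)
r_at_2 = recall_at_k(queries, gallery, ground_truth, k=2)
```

Zero-shot accuracy is the share of test images for which `zero_shot_predict` returns the correct class. Recall@5 is `recall_at_k` with `k=5` over all caption queries.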
https://arxiv.org/abs/2302.05442
When people say just "ViT", I think they usually mean /motoso/Vision Transformer.
基素.icon